Overview

Dataset statistics

Number of variables: 21
Number of observations: 1781
Missing cells: 814
Missing cells (%): 2.2%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 292.3 KiB
Average record size in memory: 168.1 B
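
The derived figures in this table follow from the raw counts. A minimal sketch of the arithmetic, using only the numbers reported here; the variable names are illustrative, and the conversion assumes 1 KiB = 1024 bytes:

```python
# Figures copied from the dataset statistics table.
n_variables = 21
n_observations = 1781
missing_cells = 814
total_size_kib = 292.3

# Share of missing cells over all cells (rows x columns).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average record size, assuming 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Both results round to the values shown in the table (2.2% and 168.1 B).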

Variable types

Categorical: 8
Numeric: 13

Alerts

URL has a high cardinality: 1781 distinct values (High cardinality)
SERVER has a high cardinality: 239 distinct values (High cardinality)
WHOIS_STATEPRO has a high cardinality: 182 distinct values (High cardinality)
WHOIS_REGDATE has a high cardinality: 891 distinct values (High cardinality)
WHOIS_UPDATED_DATE has a high cardinality: 594 distinct values (High cardinality)
URL_LENGTH is highly overall correlated with NUMBER_SPECIAL_CHARACTERS (High correlation)
NUMBER_SPECIAL_CHARACTERS is highly overall correlated with URL_LENGTH (High correlation)
TCP_CONVERSATION_EXCHANGE is highly overall correlated with DIST_REMOTE_TCP_PORT and 6 other fields (High correlation)
DIST_REMOTE_TCP_PORT is highly overall correlated with TCP_CONVERSATION_EXCHANGE and 6 other fields (High correlation)
REMOTE_IPS is highly overall correlated with DNS_QUERY_TIMES (High correlation)
APP_BYTES is highly overall correlated with TCP_CONVERSATION_EXCHANGE and 5 other fields (High correlation)
SOURCE_APP_PACKETS is highly overall correlated with TCP_CONVERSATION_EXCHANGE and 6 other fields (High correlation)
REMOTE_APP_PACKETS is highly overall correlated with TCP_CONVERSATION_EXCHANGE and 6 other fields (High correlation)
SOURCE_APP_BYTES is highly overall correlated with TCP_CONVERSATION_EXCHANGE and 4 other fields (High correlation)
REMOTE_APP_BYTES is highly overall correlated with TCP_CONVERSATION_EXCHANGE and 5 other fields (High correlation)
APP_PACKETS is highly overall correlated with TCP_CONVERSATION_EXCHANGE and 6 other fields (High correlation)
DNS_QUERY_TIMES is highly overall correlated with REMOTE_IPS (High correlation)
WHOIS_COUNTRY is highly overall correlated with Type (High correlation)
Type is highly overall correlated with WHOIS_COUNTRY (High correlation)
CONTENT_LENGTH has 812 (45.6%) missing values (Missing)
DIST_REMOTE_TCP_PORT is highly skewed (γ1 = 21.89093705) (Skewed)
APP_BYTES is highly skewed (γ1 = 41.9809937) (Skewed)
REMOTE_APP_BYTES is highly skewed (γ1 = 41.96456556) (Skewed)
URL is uniformly distributed (Uniform)
URL has unique values (Unique)
TCP_CONVERSATION_EXCHANGE has 657 (36.9%) zeros (Zeros)
DIST_REMOTE_TCP_PORT has 916 (51.4%) zeros (Zeros)
REMOTE_IPS has 657 (36.9%) zeros (Zeros)
APP_BYTES has 657 (36.9%) zeros (Zeros)
SOURCE_APP_PACKETS has 655 (36.8%) zeros (Zeros)
REMOTE_APP_PACKETS has 590 (33.1%) zeros (Zeros)
SOURCE_APP_BYTES has 590 (33.1%) zeros (Zeros)
REMOTE_APP_BYTES has 655 (36.8%) zeros (Zeros)
APP_PACKETS has 655 (36.8%) zeros (Zeros)
DNS_QUERY_TIMES has 976 (54.8%) zeros (Zeros)
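
Alerts of this kind can be read as threshold checks on per-column summaries. A minimal sketch of that idea in plain Python follows. The threshold values and the `flag_alerts` helper are assumptions made for illustration; they are not the profiler's actual configuration or API.

```python
# Hypothetical thresholds, chosen for illustration only.
CARDINALITY_THRESHOLD = 50   # distinct values above this -> high cardinality
SKEWNESS_THRESHOLD = 20      # |skewness| above this -> skewed
MIN_SHARE = 0.05             # share of zeros/missing above this -> alert


def flag_alerts(name, n_rows, distinct=None, skewness=None, zeros=0, missing=0):
    """Return a list of alert strings for one column summary."""
    alerts = []
    if distinct is not None and distinct > CARDINALITY_THRESHOLD:
        alerts.append(f"{name} has a high cardinality: {distinct} distinct values")
    if distinct is not None and distinct == n_rows:
        alerts.append(f"{name} has unique values")
    if skewness is not None and abs(skewness) > SKEWNESS_THRESHOLD:
        alerts.append(f"{name} is highly skewed")
    if zeros / n_rows > MIN_SHARE:
        alerts.append(f"{name} has {zeros} ({100 * zeros / n_rows:.1f}%) zeros")
    if missing / n_rows > MIN_SHARE:
        alerts.append(f"{name} has {missing} ({100 * missing / n_rows:.1f}%) missing values")
    return alerts


# Summaries taken from the report (1781 observations).
print(flag_alerts("DIST_REMOTE_TCP_PORT", 1781, skewness=21.89093705, zeros=916))
print(flag_alerts("CONTENT_LENGTH", 1781, zeros=5, missing=812))
```

With these assumed thresholds, the 5 zeros in CONTENT_LENGTH (0.3%) stay below the cut-off and only its missing values are reported.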

Reproduction

Analysis started: 2022-11-30 18:43:25.654052
Analysis finished: 2022-11-30 18:44:07.764774
Duration: 42.11 seconds
Software version: pandas-profiling v3.5.0
Download configuration: config.json
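
The reported duration is the difference between the two timestamps. A short check using only the standard library; the format string is an assumption based on how the timestamps are printed here:

```python
from datetime import datetime

# Assumed timestamp layout, matching the two values shown above.
FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2022-11-30 18:43:25.654052", FMT)
finished = datetime.strptime("2022-11-30 18:44:07.764774", FMT)

duration_seconds = (finished - started).total_seconds()
print(f"Duration: {duration_seconds:.2f} seconds")
```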

Variables

URL
Categorical

HIGH CARDINALITY
UNIFORM
UNIQUE

Distinct: 1781
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 14.0 KiB

M0_109 1
B0_999 1
B0_2292 1
B0_2168 1
B0_2108 1
Other values (1776) 1776

Length

Max length: 7
Median length: 6
Mean length: 6.2453678
Min length: 4

Characters and Unicode

Total characters: 11123
Distinct characters: 13
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
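
As a rough illustration of how such per-character properties can be tallied, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. The `category_counts` helper is hypothetical and is not the profiler's implementation. The two-letter codes map onto the category names used in this report: `Lu` is Uppercase Letter, `Nd` is Decimal Number, `Pc` is Connector Punctuation.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over all characters in `values`."""
    counts = Counter()
    for value in values:
        counts.update(unicodedata.category(ch) for ch in value)
    return counts


# Two identifiers from the sample rows of this variable.
print(category_counts(["M0_109", "B0_2314"]))
```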

Unique

Unique: 1781
Unique (%): 100.0%

Sample

1st row: M0_109
2nd row: B0_2314
3rd row: B0_911
4th row: B0_113
5th row: B0_403

Common Values

ValueCountFrequency (%)
M0_109 1
 
0.1%
B0_999 1
 
0.1%
B0_2292 1
 
0.1%
B0_2168 1
 
0.1%
B0_2108 1
 
0.1%
B0_2053 1
 
0.1%
B0_2035 1
 
0.1%
B0_1400 1
 
0.1%
B0_1297 1
 
0.1%
B0_1278 1
 
0.1%
Other values (1771) 1771
99.4%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
m0_109 1
 
0.1%
b0_916 1
 
0.1%
b0_911 1
 
0.1%
b0_113 1
 
0.1%
b0_403 1
 
0.1%
b0_2064 1
 
0.1%
b0_462 1
 
0.1%
b0_1128 1
 
0.1%
m2_17 1
 
0.1%
m3_75 1
 
0.1%
Other values (1771) 1771
99.4%

Most occurring characters

ValueCountFrequency (%)
0 2222
20.0%
_ 1781
16.0%
B 1565
14.1%
1 1108
10.0%
2 930
8.4%
3 563
 
5.1%
4 554
 
5.0%
6 447
 
4.0%
5 441
 
4.0%
7 433
 
3.9%
Other values (3) 1079
9.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 7561
68.0%
Connector Punctuation 1781
 
16.0%
Uppercase Letter 1781
 
16.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 2222
29.4%
1 1108
14.7%
2 930
12.3%
3 563
 
7.4%
4 554
 
7.3%
6 447
 
5.9%
5 441
 
5.8%
7 433
 
5.7%
8 432
 
5.7%
9 431
 
5.7%
Uppercase Letter
ValueCountFrequency (%)
B 1565
87.9%
M 216
 
12.1%
Connector Punctuation
ValueCountFrequency (%)
_ 1781
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 9342
84.0%
Latin 1781
 
16.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 2222
23.8%
_ 1781
19.1%
1 1108
11.9%
2 930
10.0%
3 563
 
6.0%
4 554
 
5.9%
6 447
 
4.8%
5 441
 
4.7%
7 433
 
4.6%
8 432
 
4.6%
Latin
ValueCountFrequency (%)
B 1565
87.9%
M 216
 
12.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 11123
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 2222
20.0%
_ 1781
16.0%
B 1565
14.1%
1 1108
10.0%
2 930
8.4%
3 563
 
5.1%
4 554
 
5.0%
6 447
 
4.0%
5 441
 
4.0%
7 433
 
3.9%
Other values (3) 1079
9.7%

URL_LENGTH
Real number (ℝ)

Distinct: 142
Distinct (%): 8.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 56.961258
Minimum: 16
Maximum: 249
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 14.0 KiB

Quantile statistics

Minimum: 16
5-th percentile: 26
Q1: 39
median: 49
Q3: 68
95-th percentile: 110
Maximum: 249
Range: 233
Interquartile range (IQR): 29

Descriptive statistics

Standard deviation: 27.555586
Coefficient of variation (CV): 0.48376013
Kurtosis: 5.0208838
Mean: 56.961258
Median Absolute Deviation (MAD): 12
Skewness: 1.8026857
Sum: 101448
Variance: 759.3103
Monotonicity: Increasing
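
Several of these statistics are simple functions of one another. The sketch below recomputes them from the figures reported for URL_LENGTH. It is an arithmetic cross-check under the usual definitions (CV as standard deviation over mean, variance as squared standard deviation), not the profiler's own code.

```python
# Values reported for URL_LENGTH.
n = 1781
total = 101448
std = 27.555586
minimum, maximum = 16, 249
q1, q3 = 39, 68

mean = total / n                  # sum divided by number of observations
cv = std / mean                   # coefficient of variation
variance = std ** 2               # square of the standard deviation
value_range = maximum - minimum
iqr = q3 - q1                     # interquartile range

print(f"mean={mean:.6f} cv={cv:.8f} variance={variance:.4f}")
print(f"range={value_range} iqr={iqr}")
```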
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
39 86
 
4.8%
40 48
 
2.7%
46 44
 
2.5%
42 43
 
2.4%
38 43
 
2.4%
47 41
 
2.3%
45 41
 
2.3%
49 39
 
2.2%
35 37
 
2.1%
44 36
 
2.0%
Other values (132) 1323
74.3%
ValueCountFrequency (%)
16 3
 
0.2%
17 2
 
0.1%
18 2
 
0.1%
19 1
 
0.1%
20 7
0.4%
21 4
 
0.2%
22 9
0.5%
23 15
0.8%
24 14
0.8%
25 12
0.7%
ValueCountFrequency (%)
249 1
0.1%
234 1
0.1%
201 1
0.1%
198 1
0.1%
194 2
0.1%
183 1
0.1%
178 1
0.1%
173 1
0.1%
170 1
0.1%
169 1
0.1%

NUMBER_SPECIAL_CHARACTERS
Real number (ℝ)

Distinct: 31
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11.111735
Minimum: 5
Maximum: 43
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 14.0 KiB

Quantile statistics

Minimum: 5
5-th percentile: 6
Q1: 8
median: 10
Q3: 13
95-th percentile: 20
Maximum: 43
Range: 38
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 4.549896
Coefficient of variation (CV): 0.40946765
Kurtosis: 5.2617134
Mean: 11.111735
Median Absolute Deviation (MAD): 2
Skewness: 1.8799729
Sum: 19790
Variance: 20.701553
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=31)
ValueCountFrequency (%)
9 274
15.4%
8 211
11.8%
11 208
11.7%
10 198
11.1%
7 159
8.9%
6 148
8.3%
12 134
7.5%
13 92
 
5.2%
14 58
 
3.3%
15 50
 
2.8%
Other values (21) 249
14.0%
ValueCountFrequency (%)
5 2
 
0.1%
6 148
8.3%
7 159
8.9%
8 211
11.8%
9 274
15.4%
10 198
11.1%
11 208
11.7%
12 134
7.5%
13 92
 
5.2%
14 58
 
3.3%
ValueCountFrequency (%)
43 1
 
0.1%
40 1
 
0.1%
36 1
 
0.1%
34 3
0.2%
31 2
 
0.1%
30 1
 
0.1%
29 4
0.2%
28 2
 
0.1%
27 6
0.3%
26 7
0.4%

CHARSET
Categorical

Distinct: 9
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 14.0 KiB

UTF-8 676
ISO-8859-1 427
utf-8 379
us-ascii 155
iso-8859-1 134
Other values (4) 10

Length

Max length: 12
Median length: 5
Mean length: 6.841662
Min length: 4

Characters and Unicode

Total characters: 12185
Distinct characters: 25
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3
Unique (%): 0.2%

Sample

1st row: iso-8859-1
2nd row: UTF-8
3rd row: us-ascii
4th row: ISO-8859-1
5th row: UTF-8

Common Values

ValueCountFrequency (%)
UTF-8 676
38.0%
ISO-8859-1 427
24.0%
utf-8 379
21.3%
us-ascii 155
 
8.7%
iso-8859-1 134
 
7.5%
None 7
 
0.4%
windows-1251 1
 
0.1%
ISO-8859 1
 
0.1%
windows-1252 1
 
0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
utf-8 1055
59.2%
iso-8859-1 561
31.5%
us-ascii 155
 
8.7%
none 7
 
0.4%
windows-1251 1
 
0.1%
iso-8859 1
 
0.1%
windows-1252 1
 
0.1%
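
The lowercase table above merges labels that differ only in case (for example `UTF-8` and `utf-8`). A minimal sketch of that normalization, using the raw counts listed under "Common Values"; treating lowercasing as sufficient normalization is an assumption and may not suit every cleaning task:

```python
from collections import Counter

# Raw CHARSET counts as listed under "Common Values".
raw_counts = {
    "UTF-8": 676,
    "ISO-8859-1": 427,
    "utf-8": 379,
    "us-ascii": 155,
    "iso-8859-1": 134,
    "None": 7,
    "windows-1251": 1,
    "ISO-8859": 1,
    "windows-1252": 1,
}

# Fold labels to lowercase and add up the counts of merged variants.
normalized = Counter()
for label, count in raw_counts.items():
    normalized[label.lower()] += count

for label, count in normalized.most_common():
    print(label, count)
```

After folding, nine raw labels collapse to seven, and the merged totals match the lowercase table (1055 for `utf-8`, 561 for `iso-8859-1`).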

Most occurring characters

ValueCountFrequency (%)
- 2335
19.2%
8 2179
17.9%
U 676
 
5.5%
F 676
 
5.5%
T 676
 
5.5%
5 564
 
4.6%
1 564
 
4.6%
9 562
 
4.6%
u 534
 
4.4%
s 446
 
3.7%
Other values (15) 2973
24.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3872
31.8%
Uppercase Letter 3319
27.2%
Lowercase Letter 2659
21.8%
Dash Punctuation 2335
19.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
u 534
20.1%
s 446
16.8%
i 446
16.8%
t 379
14.3%
f 379
14.3%
a 155
 
5.8%
c 155
 
5.8%
o 143
 
5.4%
n 9
 
0.3%
e 7
 
0.3%
Other values (2) 6
 
0.2%
Uppercase Letter
ValueCountFrequency (%)
U 676
20.4%
F 676
20.4%
T 676
20.4%
I 428
12.9%
S 428
12.9%
O 428
12.9%
N 7
 
0.2%
Decimal Number
ValueCountFrequency (%)
8 2179
56.3%
5 564
 
14.6%
1 564
 
14.6%
9 562
 
14.5%
2 3
 
0.1%
Dash Punctuation
ValueCountFrequency (%)
- 2335
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 6207
50.9%
Latin 5978
49.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
U 676
11.3%
F 676
11.3%
T 676
11.3%
u 534
8.9%
s 446
7.5%
i 446
7.5%
I 428
7.2%
S 428
7.2%
O 428
7.2%
t 379
6.3%
Other values (9) 861
14.4%
Common
ValueCountFrequency (%)
- 2335
37.6%
8 2179
35.1%
5 564
 
9.1%
1 564
 
9.1%
9 562
 
9.1%
2 3
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 12185
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 2335
19.2%
8 2179
17.9%
U 676
 
5.5%
F 676
 
5.5%
T 676
 
5.5%
5 564
 
4.6%
1 564
 
4.6%
9 562
 
4.6%
u 534
 
4.4%
s 446
 
3.7%
Other values (15) 2973
24.4%

SERVER
Categorical

Distinct: 239
Distinct (%): 13.4%
Missing: 1
Missing (%): 0.1%
Memory size: 14.0 KiB

Apache 386
nginx 211
None 175
Microsoft-HTTPAPI/2.0 113
cloudflare-nginx 94
Other values (234) 801

Length

Max length: 171
Median length: 114
Mean length: 13.748315
Min length: 2

Characters and Unicode

Total characters: 24472
Distinct characters: 72
Distinct categories: 10
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 142
Unique (%): 8.0%

Sample

1st row: nginx
2nd row: Apache/2.4.10
3rd row: Microsoft-HTTPAPI/2.0
4th row: nginx
5th row: None

Common Values

ValueCountFrequency (%)
Apache 386
21.7%
nginx 211
 
11.8%
None 175
 
9.8%
Microsoft-HTTPAPI/2.0 113
 
6.3%
cloudflare-nginx 94
 
5.3%
Microsoft-IIS/7.5 51
 
2.9%
GSE 49
 
2.8%
Server 49
 
2.8%
YouTubeFrontEnd 42
 
2.4%
nginx/1.12.0 36
 
2.0%
Other values (229) 574
32.2%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
apache 387
 
16.2%
nginx 216
 
9.0%
none 175
 
7.3%
microsoft-httpapi/2.0 113
 
4.7%
cloudflare-nginx 94
 
3.9%
server 55
 
2.3%
centos 52
 
2.2%
microsoft-iis/7.5 52
 
2.2%
unix 52
 
2.2%
gse 49
 
2.1%
Other values (307) 1142
47.8%

Most occurring characters

ValueCountFrequency (%)
e 1612
 
6.6%
. 1591
 
6.5%
n 1551
 
6.3%
o 1058
 
4.3%
c 1033
 
4.2%
2 950
 
3.9%
i 942
 
3.8%
/ 896
 
3.7%
a 896
 
3.7%
A 843
 
3.4%
Other values (62) 13100
53.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 13548
55.4%
Uppercase Letter 3764
 
15.4%
Decimal Number 3060
 
12.5%
Other Punctuation 2492
 
10.2%
Space Separator 611
 
2.5%
Dash Punctuation 424
 
1.7%
Close Punctuation 221
 
0.9%
Open Punctuation 221
 
0.9%
Connector Punctuation 119
 
0.5%
Math Symbol 12
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 1612
11.9%
n 1551
11.4%
o 1058
 
7.8%
c 1033
 
7.6%
i 942
 
7.0%
a 896
 
6.6%
p 840
 
6.2%
h 738
 
5.4%
r 569
 
4.2%
t 565
 
4.2%
Other values (16) 3744
27.6%
Uppercase Letter
ValueCountFrequency (%)
A 843
22.4%
S 508
13.5%
P 395
10.5%
T 310
 
8.2%
I 285
 
7.6%
M 204
 
5.4%
H 193
 
5.1%
N 182
 
4.8%
O 147
 
3.9%
E 96
 
2.6%
Other values (15) 601
16.0%
Decimal Number
ValueCountFrequency (%)
2 950
31.0%
1 701
22.9%
0 384
12.5%
5 239
 
7.8%
4 207
 
6.8%
3 157
 
5.1%
7 109
 
3.6%
8 108
 
3.5%
6 107
 
3.5%
9 98
 
3.2%
Other Punctuation
ValueCountFrequency (%)
. 1591
63.8%
/ 896
36.0%
; 3
 
0.1%
! 1
 
< 0.1%
& 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
611
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 424
100.0%
Close Punctuation
ValueCountFrequency (%)
) 221
100.0%
Open Punctuation
ValueCountFrequency (%)
( 221
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 119
100.0%
Math Symbol
ValueCountFrequency (%)
+ 12
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 17312
70.7%
Common 7160
29.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 1612
 
9.3%
n 1551
 
9.0%
o 1058
 
6.1%
c 1033
 
6.0%
i 942
 
5.4%
a 896
 
5.2%
A 843
 
4.9%
p 840
 
4.9%
h 738
 
4.3%
r 569
 
3.3%
Other values (41) 7230
41.8%
Common
ValueCountFrequency (%)
. 1591
22.2%
2 950
13.3%
/ 896
12.5%
1 701
9.8%
611
 
8.5%
- 424
 
5.9%
0 384
 
5.4%
5 239
 
3.3%
) 221
 
3.1%
( 221
 
3.1%
Other values (11) 922
12.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 24472
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 1612
 
6.6%
. 1591
 
6.5%
n 1551
 
6.3%
o 1058
 
4.3%
c 1033
 
4.2%
2 950
 
3.9%
i 942
 
3.8%
/ 896
 
3.7%
a 896
 
3.7%
A 843
 
3.4%
Other values (62) 13100
53.5%

CONTENT_LENGTH
Real number (ℝ)

Distinct: 637
Distinct (%): 65.7%
Missing: 812
Missing (%): 45.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 11726.928
Minimum: 0
Maximum: 649263
Zeros: 5
Zeros (%): 0.3%
Negative: 0
Negative (%): 0.0%
Memory size: 14.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 162
Q1: 324
median: 1853
Q3: 11323
95-th percentile: 44319.6
Maximum: 649263
Range: 649263
Interquartile range (IQR): 10999

Descriptive statistics

Standard deviation: 36391.809
Coefficient of variation (CV): 3.1032688
Kurtosis: 144.64797
Mean: 11726.928
Median Absolute Deviation (MAD): 1691
Skewness: 10.571179
Sum: 11363393
Variance: 1.3243638 × 10^9
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
324 138
 
7.7%
1819 20
 
1.1%
2516 13
 
0.7%
162 12
 
0.7%
345 11
 
0.6%
6748 8
 
0.4%
640 7
 
0.4%
11 7
 
0.4%
257 7
 
0.4%
34 6
 
0.3%
Other values (627) 740
41.5%
(Missing) 812
45.6%
ValueCountFrequency (%)
0 5
0.3%
9 6
0.3%
11 7
0.4%
13 2
 
0.1%
20 2
 
0.1%
21 1
 
0.1%
26 1
 
0.1%
34 6
0.3%
39 3
0.2%
57 1
 
0.1%
ValueCountFrequency (%)
649263 1
0.1%
435494 1
0.1%
420762 1
0.1%
359174 1
0.1%
256306 1
0.1%
246324 1
0.1%
208082 1
0.1%
135444 1
0.1%
124140 1
0.1%
121211 1
0.1%

WHOIS_COUNTRY
Categorical

Distinct: 49
Distinct (%): 2.8%
Missing: 0
Missing (%): 0.0%
Memory size: 14.0 KiB

US 1103
None 306
CA 84
ES 63
AU 35
Other values (44) 190

Length

Max length: 14
Median length: 2
Mean length: 2.3885458
Min length: 2

Characters and Unicode

Total characters: 4254
Distinct characters: 40
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 11
Unique (%): 0.6%

Sample

1st row: None
2nd row: None
3rd row: None
4th row: US
5th row: US

Common Values

ValueCountFrequency (%)
US 1103
61.9%
None 306
 
17.2%
CA 84
 
4.7%
ES 63
 
3.5%
AU 35
 
2.0%
PA 21
 
1.2%
GB 19
 
1.1%
JP 11
 
0.6%
CN 10
 
0.6%
IN 10
 
0.6%
Other values (39) 119
 
6.7%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
us 1106
61.9%
none 306
 
17.1%
ca 84
 
4.7%
es 63
 
3.5%
au 35
 
2.0%
pa 21
 
1.2%
gb 19
 
1.1%
jp 11
 
0.6%
cn 10
 
0.6%
in 10
 
0.6%
Other values (38) 122
 
6.8%

Most occurring characters

ValueCountFrequency (%)
S 1178
27.7%
U 1162
27.3%
N 334
 
7.9%
n 308
 
7.2%
e 308
 
7.2%
o 307
 
7.2%
A 147
 
3.5%
C 114
 
2.7%
E 74
 
1.7%
P 37
 
0.9%
Other values (30) 285
 
6.7%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 3248
76.4%
Lowercase Letter 965
 
22.7%
Other Punctuation 25
 
0.6%
Space Separator 6
 
0.1%
Open Punctuation 5
 
0.1%
Close Punctuation 5
 
0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
S 1178
36.3%
U 1162
35.8%
N 334
 
10.3%
A 147
 
4.5%
C 114
 
3.5%
E 74
 
2.3%
P 37
 
1.1%
B 34
 
1.0%
K 30
 
0.9%
G 27
 
0.8%
Other values (12) 111
 
3.4%
Lowercase Letter
ValueCountFrequency (%)
n 308
31.9%
e 308
31.9%
o 307
31.8%
u 19
 
2.0%
s 6
 
0.6%
r 6
 
0.6%
y 2
 
0.2%
p 2
 
0.2%
i 2
 
0.2%
d 2
 
0.2%
Other values (3) 3
 
0.3%
Other Punctuation
ValueCountFrequency (%)
' 20
80.0%
; 5
 
20.0%
Space Separator
ValueCountFrequency (%)
6
100.0%
Open Punctuation
ValueCountFrequency (%)
[ 5
100.0%
Close Punctuation
ValueCountFrequency (%)
] 5
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4213
99.0%
Common 41
 
1.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
S 1178
28.0%
U 1162
27.6%
N 334
 
7.9%
n 308
 
7.3%
e 308
 
7.3%
o 307
 
7.3%
A 147
 
3.5%
C 114
 
2.7%
E 74
 
1.8%
P 37
 
0.9%
Other values (25) 244
 
5.8%
Common
ValueCountFrequency (%)
' 20
48.8%
6
 
14.6%
[ 5
 
12.2%
] 5
 
12.2%
; 5
 
12.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4254
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
S 1178
27.7%
U 1162
27.3%
N 334
 
7.9%
n 308
 
7.2%
e 308
 
7.2%
o 307
 
7.2%
A 147
 
3.5%
C 114
 
2.7%
E 74
 
1.7%
P 37
 
0.9%
Other values (30) 285
 
6.7%

WHOIS_STATEPRO
Categorical

Distinct: 182
Distinct (%): 10.2%
Missing: 0
Missing (%): 0.0%
Memory size: 14.0 KiB

CA 372
None 362
NY 75
WA 65
Barcelona 62
Other values (177) 845

Length

Max length: 20
Median length: 2
Mean length: 4.032566
Min length: 1

Characters and Unicode

Total characters: 7182
Distinct characters: 61
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 91
Unique (%): 5.1%

Sample

1st row: None
2nd row: None
3rd row: None
4th row: AK
5th row: TX

Common Values

ValueCountFrequency (%)
CA 372
20.9%
None 362
20.3%
NY 75
 
4.2%
WA 65
 
3.6%
Barcelona 62
 
3.5%
FL 61
 
3.4%
Arizona 58
 
3.3%
California 57
 
3.2%
ON 45
 
2.5%
NV 30
 
1.7%
Other values (172) 594
33.4%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
ca 376
20.5%
none 363
19.8%
ny 76
 
4.2%
wa 65
 
3.5%
barcelona 62
 
3.4%
fl 61
 
3.3%
arizona 58
 
3.2%
california 58
 
3.2%
on 45
 
2.5%
nv 30
 
1.6%
Other values (161) 637
34.8%

Most occurring characters

ValueCountFrequency (%)
n 710
 
9.9%
A 697
 
9.7%
o 676
 
9.4%
N 612
 
8.5%
e 575
 
8.0%
a 507
 
7.1%
C 496
 
6.9%
i 319
 
4.4%
r 265
 
3.7%
l 192
 
2.7%
Other values (51) 2133
29.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4039
56.2%
Uppercase Letter 3069
42.7%
Space Separator 50
 
0.7%
Decimal Number 12
 
0.2%
Dash Punctuation 8
 
0.1%
Other Punctuation 4
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 710
17.6%
o 676
16.7%
e 575
14.2%
a 507
12.6%
i 319
7.9%
r 265
 
6.6%
l 192
 
4.8%
s 135
 
3.3%
c 100
 
2.5%
t 71
 
1.8%
Other values (16) 489
12.1%
Uppercase Letter
ValueCountFrequency (%)
A 697
22.7%
N 612
19.9%
C 496
16.2%
O 142
 
4.6%
M 107
 
3.5%
L 107
 
3.5%
W 99
 
3.2%
Y 91
 
3.0%
B 81
 
2.6%
T 77
 
2.5%
Other values (16) 560
18.2%
Decimal Number
ValueCountFrequency (%)
1 7
58.3%
0 2
 
16.7%
3 1
 
8.3%
6 1
 
8.3%
2 1
 
8.3%
Other Punctuation
ValueCountFrequency (%)
. 3
75.0%
@ 1
 
25.0%
Space Separator
ValueCountFrequency (%)
50
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 8
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 7108
99.0%
Common 74
 
1.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 710
 
10.0%
A 697
 
9.8%
o 676
 
9.5%
N 612
 
8.6%
e 575
 
8.1%
a 507
 
7.1%
C 496
 
7.0%
i 319
 
4.5%
r 265
 
3.7%
l 192
 
2.7%
Other values (42) 2059
29.0%
Common
ValueCountFrequency (%)
50
67.6%
- 8
 
10.8%
1 7
 
9.5%
. 3
 
4.1%
0 2
 
2.7%
@ 1
 
1.4%
3 1
 
1.4%
6 1
 
1.4%
2 1
 
1.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7182
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 710
 
9.9%
A 697
 
9.7%
o 676
 
9.4%
N 612
 
8.5%
e 575
 
8.0%
a 507
 
7.1%
C 496
 
6.9%
i 319
 
4.4%
r 265
 
3.7%
l 192
 
2.7%
Other values (51) 2133
29.7%

WHOIS_REGDATE
Categorical

Distinct: 891
Distinct (%): 50.0%
Missing: 0
Missing (%): 0.0%
Memory size: 14.0 KiB

None 127
17/09/2008 0:00 62
13/01/2001 0:12 59
31/07/2000 0:00 47
15/02/2005 0:00 41
Other values (886) 1445

Length

Max length: 22
Median length: 15
Mean length: 13.982594
Min length: 1

Characters and Unicode

Total characters: 24903
Distinct characters: 22
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 701
Unique (%): 39.4%

Sample

1st row: 10/10/2015 18:21
2nd row: None
3rd row: None
4th row: 7/10/1997 4:00
5th row: 12/05/1996 0:00
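
This variable is profiled as categorical because the dates are stored as strings. A minimal sketch of parsing them follows, assuming the day/month/year layout suggested by values such as 17/09/2008 and 31/07/2000. The `parse_whois_date` helper is hypothetical. Strings in any other layout, and the literal "None", are mapped to `None` rather than guessed at.

```python
from datetime import datetime

# Assumed layout: day/month/year hour:minute, as in "17/09/2008 0:00".
DATE_FORMAT = "%d/%m/%Y %H:%M"


def parse_whois_date(text):
    """Return a datetime, or None for "None" and for strings in another layout."""
    try:
        return datetime.strptime(text.strip(), DATE_FORMAT)
    except ValueError:
        return None


print(parse_whois_date("10/10/2015 18:21"))
print(parse_whois_date("7/10/1997 4:00"))
print(parse_whois_date("None"))
```

Returning `None` for unparsed strings keeps the fallback explicit, so rows in a different layout can be counted and inspected separately.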

Common Values

ValueCountFrequency (%)
None 127
 
7.1%
17/09/2008 0:00 62
 
3.5%
13/01/2001 0:12 59
 
3.3%
31/07/2000 0:00 47
 
2.6%
15/02/2005 0:00 41
 
2.3%
29/03/1997 0:00 33
 
1.9%
1/11/1994 0:00 30
 
1.7%
18/01/1995 0:00 25
 
1.4%
2/11/2002 0:00 21
 
1.2%
16/05/1995 0:00 17
 
1.0%
Other values (881) 1319
74.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
0:00 1470
42.9%
none 127
 
3.7%
17/09/2008 62
 
1.8%
13/01/2001 59
 
1.7%
0:12 59
 
1.7%
31/07/2000 47
 
1.4%
15/02/2005 41
 
1.2%
29/03/1997 33
 
1.0%
1/11/1994 30
 
0.9%
18/01/1995 25
 
0.7%
Other values (941) 1474
43.0%

Most occurring characters

ValueCountFrequency (%)
0 8294
33.3%
/ 3292
 
13.2%
1 2485
 
10.0%
2 2196
 
8.8%
9 1701
 
6.8%
: 1656
 
6.6%
1646
 
6.6%
3 605
 
2.4%
5 592
 
2.4%
7 490
 
2.0%
Other values (12) 1946
 
7.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 17774
71.4%
Other Punctuation 4953
 
19.9%
Space Separator 1646
 
6.6%
Lowercase Letter 383
 
1.5%
Uppercase Letter 137
 
0.6%
Dash Punctuation 10
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 8294
46.7%
1 2485
 
14.0%
2 2196
 
12.4%
9 1701
 
9.6%
3 605
 
3.4%
5 592
 
3.3%
7 490
 
2.8%
8 484
 
2.7%
6 480
 
2.7%
4 447
 
2.5%
Lowercase Letter
ValueCountFrequency (%)
o 127
33.2%
e 127
33.2%
n 127
33.2%
b 2
 
0.5%
Other Punctuation
ValueCountFrequency (%)
/ 3292
66.5%
: 1656
33.4%
. 5
 
0.1%
Uppercase Letter
ValueCountFrequency (%)
N 127
92.7%
T 5
 
3.6%
Z 5
 
3.6%
Space Separator
ValueCountFrequency (%)
1646
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 10
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 24383
97.9%
Latin 520
 
2.1%

Most frequent character per script

Common
ValueCountFrequency (%)
0 8294
34.0%
/ 3292
 
13.5%
1 2485
 
10.2%
2 2196
 
9.0%
9 1701
 
7.0%
: 1656
 
6.8%
1646
 
6.8%
3 605
 
2.5%
5 592
 
2.4%
7 490
 
2.0%
Other values (5) 1426
 
5.8%
Latin
ValueCountFrequency (%)
N 127
24.4%
o 127
24.4%
e 127
24.4%
n 127
24.4%
T 5
 
1.0%
Z 5
 
1.0%
b 2
 
0.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 24903
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 8294
33.3%
/ 3292
 
13.2%
1 2485
 
10.0%
2 2196
 
8.8%
9 1701
 
6.8%
: 1656
 
6.6%
1646
 
6.6%
3 605
 
2.4%
5 592
 
2.4%
7 490
 
2.0%
Other values (12) 1946
 
7.8%

WHOIS_UPDATED_DATE
Categorical

Distinct: 594
Distinct (%): 33.4%
Missing: 0
Missing (%): 0.0%
Memory size: 14.0 KiB

None 139
2/09/2016 0:00 64
12/12/2015 10:16 59
29/06/2016 0:00 47
14/01/2017 0:00 42
Other values (589) 1430

Length

Max length: 22
Median length: 15
Mean length: 13.947221
Min length: 4

Characters and Unicode

Total characters: 24840
Distinct characters: 21
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 341
Unique (%): 19.1%

Sample

1st row: None
2nd row: None
3rd row: None
4th row: 12/09/2013 0:45
5th row: 11/04/2017 0:00

Common Values

ValueCountFrequency (%)
None 139
 
7.8%
2/09/2016 0:00 64
 
3.6%
12/12/2015 10:16 59
 
3.3%
29/06/2016 0:00 47
 
2.6%
14/01/2017 0:00 42
 
2.4%
29/11/2016 0:00 36
 
2.0%
26/08/2015 0:00 31
 
1.7%
21/10/2016 0:00 30
 
1.7%
30/04/2014 0:00 29
 
1.6%
3/03/2017 0:00 27
 
1.5%
Other values (584) 1277
71.7%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
0:00 1472
43.1%
none 139
 
4.1%
2/09/2016 64
 
1.9%
12/12/2015 59
 
1.7%
10:16 59
 
1.7%
29/06/2016 47
 
1.4%
14/01/2017 43
 
1.3%
29/11/2016 36
 
1.1%
26/08/2015 31
 
0.9%
21/10/2016 30
 
0.9%
Other values (617) 1438
42.1%

Most occurring characters

ValueCountFrequency (%)
0 7651
30.8%
/ 3274
13.2%
1 3203
12.9%
2 2844
 
11.4%
: 1647
 
6.6%
1637
 
6.6%
6 1108
 
4.5%
7 668
 
2.7%
5 560
 
2.3%
4 515
 
2.1%
Other values (11) 1733
 
7.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 17701
71.3%
Other Punctuation 4926
 
19.8%
Space Separator 1637
 
6.6%
Lowercase Letter 417
 
1.7%
Uppercase Letter 149
 
0.6%
Dash Punctuation 10
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 7651
43.2%
1 3203
18.1%
2 2844
 
16.1%
6 1108
 
6.3%
7 668
 
3.8%
5 560
 
3.2%
4 515
 
2.9%
3 505
 
2.9%
9 366
 
2.1%
8 281
 
1.6%
Other Punctuation
ValueCountFrequency (%)
/ 3274
66.5%
: 1647
33.4%
. 5
 
0.1%
Uppercase Letter
ValueCountFrequency (%)
N 139
93.3%
T 5
 
3.4%
Z 5
 
3.4%
Lowercase Letter
ValueCountFrequency (%)
o 139
33.3%
e 139
33.3%
n 139
33.3%
Space Separator
ValueCountFrequency (%)
1637
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 10
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 24274
97.7%
Latin 566
 
2.3%

Most frequent character per script

Common
ValueCountFrequency (%)
0 7651
31.5%
/ 3274
13.5%
1 3203
13.2%
2 2844
 
11.7%
: 1647
 
6.8%
1637
 
6.7%
6 1108
 
4.6%
7 668
 
2.8%
5 560
 
2.3%
4 515
 
2.1%
Other values (5) 1167
 
4.8%
Latin
ValueCountFrequency (%)
N 139
24.6%
o 139
24.6%
e 139
24.6%
n 139
24.6%
T 5
 
0.9%
Z 5
 
0.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 24840
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 7651
30.8%
/ 3274
13.2%
1 3203
12.9%
2 2844
 
11.4%
: 1647
 
6.6%
1637
 
6.6%
6 1108
 
4.5%
7 668
 
2.7%
5 560
 
2.3%
4 515
 
2.1%
Other values (11) 1733
 
7.0%

TCP_CONVERSATION_EXCHANGE
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct: 103
Distinct (%): 5.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 16.261089
Minimum: 0
Maximum: 1194
Zeros: 657
Zeros (%): 36.9%
Negative: 0
Negative (%): 0.0%
Memory size: 14.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 7
Q3: 22
95-th percentile: 55
Maximum: 1194
Range: 1194
Interquartile range (IQR): 22

Descriptive statistics

Standard deviation: 40.500975
Coefficient of variation (CV): 2.490668
Kurtosis: 453.42612
Mean: 16.261089
Median Absolute Deviation (MAD): 7
Skewness: 17.609832
Sum: 28961
Variance: 1640.329
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 657
36.9%
7 56
 
3.1%
8 53
 
3.0%
5 48
 
2.7%
4 48
 
2.7%
6 39
 
2.2%
15 36
 
2.0%
10 35
 
2.0%
12 32
 
1.8%
9 31
 
1.7%
Other values (93) 746
41.9%
ValueCountFrequency (%)
0 657
36.9%
1 16
 
0.9%
2 13
 
0.7%
3 28
 
1.6%
4 48
 
2.7%
5 48
 
2.7%
6 39
 
2.2%
7 56
 
3.1%
8 53
 
3.0%
9 31
 
1.7%
ValueCountFrequency (%)
1194 1
0.1%
709 1
0.1%
326 1
0.1%
288 1
0.1%
226 1
0.1%
208 2
0.1%
197 1
0.1%
188 1
0.1%
185 1
0.1%
157 1
0.1%

DIST_REMOTE_TCP_PORT
Real number (ℝ)

HIGH CORRELATION
SKEWED
ZEROS

Distinct: 66
Distinct (%): 3.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.4727681
Minimum: 0
Maximum: 708
Zeros: 916
Zeros (%): 51.4%
Negative: 0
Negative (%): 0.0%
Memory size: 14.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 5
95-th percentile: 26
Maximum: 708
Range: 708
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 21.807327
Coefficient of variation (CV): 3.9846978
Kurtosis: 642.40018
Mean: 5.4727681
Median Absolute Deviation (MAD): 0
Skewness: 21.890937
Sum: 9747
Variance: 475.55951
Monotonicity: Not monotonic
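
A skewness this large typically arises when most values sit near zero and a few are very large. The sketch below computes the bias-adjusted Fisher-Pearson skewness in plain Python. It assumes that this is the estimator behind the reported value (it is the one pandas' `Series.skew` uses); the `adjusted_skewness` helper and the toy data are illustrative only.

```python
import math


def adjusted_skewness(values):
    """Adjusted Fisher-Pearson skewness G1 (needs at least 3 values)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n   # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n   # third central moment
    g1 = m3 / m2 ** 1.5
    return math.sqrt(n * (n - 1)) / (n - 2) * g1


# Toy data: mostly zeros with one large value, loosely mimicking a zero-heavy column.
print(adjusted_skewness([0, 0, 0, 0, 1, 2, 3, 50]))
# Symmetric data has zero skewness.
print(adjusted_skewness([1, 2, 3, 4, 5]))
```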
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 916
51.4%
3 151
 
8.5%
1 106
 
6.0%
2 78
 
4.4%
4 69
 
3.9%
6 60
 
3.4%
5 53
 
3.0%
7 49
 
2.8%
9 31
 
1.7%
8 26
 
1.5%
Other values (56) 242
 
13.6%
ValueCountFrequency (%)
0 916
51.4%
1 106
 
6.0%
2 78
 
4.4%
3 151
 
8.5%
4 69
 
3.9%
5 53
 
3.0%
6 60
 
3.4%
7 49
 
2.8%
8 26
 
1.5%
9 31
 
1.7%
ValueCountFrequency (%)
708 1
0.1%
317 1
0.1%
279 1
0.1%
98 1
0.1%
89 1
0.1%
73 1
0.1%
67 1
0.1%
60 1
0.1%
59 1
0.1%
58 2
0.1%

REMOTE_IPS
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct: 18
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.0606401
Minimum: 0
Maximum: 17
Zeros: 657
Zeros (%): 36.9%
Negative: 0
Negative (%): 0.0%
Memory size: 14.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 2
Q3: 5
95-th percentile: 10
Maximum: 17
Range: 17
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 3.3869753
Coefficient of variation (CV): 1.1066232
Kurtosis: 0.78900501
Mean: 3.0606401
Median Absolute Deviation (MAD): 2
Skewness: 1.1303713
Sum: 5451
Variance: 11.471602
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=18)
ValueCountFrequency (%)
0 657
36.9%
2 191
 
10.7%
3 183
 
10.3%
5 134
 
7.5%
4 128
 
7.2%
1 101
 
5.7%
6 97
 
5.4%
7 80
 
4.5%
8 67
 
3.8%
9 42
 
2.4%
Other values (8) 101
 
5.7%
ValueCountFrequency (%)
0 657
36.9%
1 101
 
5.7%
2 191
 
10.7%
3 183
 
10.3%
4 128
 
7.2%
5 134
 
7.5%
6 97
 
5.4%
7 80
 
4.5%
8 67
 
3.8%
9 42
 
2.4%
ValueCountFrequency (%)
17 1
 
0.1%
16 2
 
0.1%
15 5
 
0.3%
14 9
 
0.5%
13 8
 
0.4%
12 21
 
1.2%
11 25
 
1.4%
10 30
1.7%
9 42
2.4%
8 67
3.8%

APP_BYTES
Real number (ℝ)

HIGH CORRELATION
SKEWED
ZEROS

Distinct: 825
Distinct (%): 46.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2982.3391
Minimum: 0
Maximum: 2362906
Zeros: 657
Zeros (%): 36.9%
Negative: 0
Negative (%): 0.0%
Memory size: 14.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 672
Q3: 2328
95-th percentile: 6235
Maximum: 2362906
Range: 2362906
Interquartile range (IQR): 2328

Descriptive statistics

Standard deviation: 56050.575
Coefficient of variation (CV): 18.794165
Kurtosis: 1768.3952
Mean: 2982.3391
Median Absolute Deviation (MAD): 672
Skewness: 41.980994
Sum: 5311546
Variance: 3.1416669 × 10^9
Monotonicity: Not monotonic
2022-11-30T19:44:11.125775image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 657 | 36.9%
432 | 20 | 1.1%
366 | 20 | 1.1%
498 | 18 | 1.0%
564 | 12 | 0.7%
474 | 11 | 0.6%
696 | 11 | 0.6%
486 | 9 | 0.5%
420 | 9 | 0.5%
618 | 8 | 0.4%
Other values (815) | 1006 | 56.5%

Value | Count | Frequency (%)
0 | 657 | 36.9%
54 | 1 | 0.1%
66 | 6 | 0.3%
90 | 8 | 0.4%
128 | 1 | 0.1%
132 | 3 | 0.2%
198 | 3 | 0.2%
202 | 1 | 0.1%
238 | 1 | 0.1%
264 | 2 | 0.1%

Value | Count | Frequency (%)
2362906 | 1 | 0.1%
99843 | 1 | 0.1%
26631 | 1 | 0.1%
23383 | 1 | 0.1%
20749 | 1 | 0.1%
20074 | 1 | 0.1%
18084 | 1 | 0.1%
15162 | 1 | 0.1%
14530 | 1 | 0.1%
14064 | 1 | 0.1%

SOURCE_APP_PACKETS
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct: 113
Distinct (%): 6.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 18.540146
Minimum: 0
Maximum: 1198
Zeros: 655
Zeros (%): 36.8%
Negative: 0
Negative (%): 0.0%
Memory size: 14.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 8
Q3: 26
95-th percentile: 61
Maximum: 1198
Range: 1198
Interquartile range (IQR): 26

Descriptive statistics

Standard deviation: 41.627173
Coefficient of variation (CV): 2.2452452
Kurtosis: 407.84776
Mean: 18.540146
Median Absolute Deviation (MAD): 8
Skewness: 16.308302
Sum: 33020
Variance: 1732.8216
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 655 | 36.8%
5 | 43 | 2.4%
4 | 42 | 2.4%
8 | 38 | 2.1%
6 | 38 | 2.1%
11 | 35 | 2.0%
14 | 34 | 1.9%
7 | 32 | 1.8%
23 | 31 | 1.7%
16 | 28 | 1.6%
Other values (103) | 805 | 45.2%

Value | Count | Frequency (%)
0 | 655 | 36.8%
1 | 15 | 0.8%
2 | 14 | 0.8%
3 | 25 | 1.4%
4 | 42 | 2.4%
5 | 43 | 2.4%
6 | 38 | 2.1%
7 | 32 | 1.8%
8 | 38 | 2.1%
9 | 25 | 1.4%

Value | Count | Frequency (%)
1198 | 1 | 0.1%
709 | 1 | 0.1%
330 | 1 | 0.1%
294 | 1 | 0.1%
228 | 2 | 0.1%
210 | 1 | 0.1%
200 | 1 | 0.1%
194 | 1 | 0.1%
187 | 1 | 0.1%
162 | 1 | 0.1%

REMOTE_APP_PACKETS
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct: 116
Distinct (%): 6.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 18.74621
Minimum: 0
Maximum: 1284
Zeros: 590
Zeros (%): 33.1%
Negative: 0
Negative (%): 0.0%
Memory size: 14.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 9
Q3: 25
95-th percentile: 62
Maximum: 1284
Range: 1284
Interquartile range (IQR): 25

Descriptive statistics

Standard deviation: 46.397969
Coefficient of variation (CV): 2.4750586
Kurtosis: 373.60708
Mean: 18.74621
Median Absolute Deviation (MAD): 9
Skewness: 15.995364
Sum: 33387
Variance: 2152.7715
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 590 | 33.1%
5 | 48 | 2.7%
4 | 48 | 2.7%
2 | 43 | 2.4%
9 | 42 | 2.4%
3 | 42 | 2.4%
7 | 40 | 2.2%
12 | 40 | 2.2%
6 | 40 | 2.2%
10 | 39 | 2.2%
Other values (106) | 809 | 45.4%

Value | Count | Frequency (%)
0 | 590 | 33.1%
1 | 1 | 0.1%
2 | 43 | 2.4%
3 | 42 | 2.4%
4 | 48 | 2.7%
5 | 48 | 2.7%
6 | 40 | 2.2%
7 | 40 | 2.2%
8 | 35 | 2.0%
9 | 42 | 2.4%

Value | Count | Frequency (%)
1284 | 1 | 0.1%
837 | 1 | 0.1%
442 | 1 | 0.1%
431 | 1 | 0.1%
284 | 1 | 0.1%
278 | 1 | 0.1%
263 | 1 | 0.1%
255 | 1 | 0.1%
217 | 1 | 0.1%
216 | 1 | 0.1%

SOURCE_APP_BYTES
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct: 885
Distinct (%): 49.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15892.546
Minimum: 0
Maximum: 2060012
Zeros: 590
Zeros (%): 33.1%
Negative: 0
Negative (%): 0.0%
Memory size: 14.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 579
Q3: 9806
95-th percentile: 66124
Maximum: 2060012
Range: 2060012
Interquartile range (IQR): 9806

Descriptive statistics

Standard deviation: 69861.93
Coefficient of variation (CV): 4.395893
Kurtosis: 460.95739
Mean: 15892.546
Median Absolute Deviation (MAD): 579
Skewness: 18.275493
Sum: 28304624
Variance: 4.8806892 × 10⁹
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 590 | 33.1%
124 | 43 | 2.4%
244 | 31 | 1.7%
186 | 29 | 1.6%
306 | 24 | 1.3%
442 | 10 | 0.6%
562 | 9 | 0.5%
310 | 9 | 0.5%
438 | 9 | 0.5%
372 | 8 | 0.4%
Other values (875) | 1019 | 57.2%

Value | Count | Frequency (%)
0 | 590 | 33.1%
62 | 1 | 0.1%
124 | 43 | 2.4%
182 | 1 | 0.1%
184 | 6 | 0.3%
186 | 29 | 1.6%
190 | 5 | 0.3%
213 | 1 | 0.1%
244 | 31 | 1.7%
246 | 7 | 0.4%

Value | Count | Frequency (%)
2060012 | 1 | 0.1%
1058608 | 1 | 0.1%
947971 | 1 | 0.1%
488313 | 1 | 0.1%
486769 | 1 | 0.1%
466055 | 1 | 0.1%
383760 | 1 | 0.1%
298694 | 1 | 0.1%
295213 | 1 | 0.1%
284743 | 1 | 0.1%

REMOTE_APP_BYTES
Real number (ℝ)

HIGH CORRELATION
SKEWED
ZEROS

Distinct: 822
Distinct (%): 46.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3155.5985
Minimum: 0
Maximum: 2362906
Zeros: 655
Zeros (%): 36.8%
Negative: 0
Negative (%): 0.0%
Memory size: 14.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 735
Q3: 2701
95-th percentile: 6657
Maximum: 2362906
Range: 2362906
Interquartile range (IQR): 2701

Descriptive statistics

Standard deviation: 56053.78
Coefficient of variation (CV): 17.76328
Kurtosis: 1767.47
Mean: 3155.5985
Median Absolute Deviation (MAD): 735
Skewness: 41.964566
Sum: 5620121
Variance: 3.1420263 × 10⁹
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 655 | 36.8%
366 | 20 | 1.1%
432 | 20 | 1.1%
498 | 18 | 1.0%
474 | 12 | 0.7%
564 | 12 | 0.7%
696 | 11 | 0.6%
276 | 9 | 0.5%
420 | 9 | 0.5%
486 | 9 | 0.5%
Other values (812) | 1006 | 56.5%

Value | Count | Frequency (%)
0 | 655 | 36.8%
54 | 1 | 0.1%
66 | 5 | 0.3%
90 | 8 | 0.4%
132 | 3 | 0.2%
146 | 2 | 0.1%
198 | 3 | 0.2%
202 | 1 | 0.1%
206 | 1 | 0.1%
264 | 2 | 0.1%

Value | Count | Frequency (%)
2362906 | 1 | 0.1%
100151 | 1 | 0.1%
26931 | 1 | 0.1%
23877 | 1 | 0.1%
21646 | 1 | 0.1%
21187 | 1 | 0.1%
18384 | 1 | 0.1%
15314 | 1 | 0.1%
14688 | 1 | 0.1%
14522 | 1 | 0.1%

APP_PACKETS
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct: 113
Distinct (%): 6.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 18.540146
Minimum: 0
Maximum: 1198
Zeros: 655
Zeros (%): 36.8%
Negative: 0
Negative (%): 0.0%
Memory size: 14.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 8
Q3: 26
95-th percentile: 61
Maximum: 1198
Range: 1198
Interquartile range (IQR): 26

Descriptive statistics

Standard deviation: 41.627173
Coefficient of variation (CV): 2.2452452
Kurtosis: 407.84776
Mean: 18.540146
Median Absolute Deviation (MAD): 8
Skewness: 16.308302
Sum: 33020
Variance: 1732.8216
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 655 | 36.8%
5 | 43 | 2.4%
4 | 42 | 2.4%
8 | 38 | 2.1%
6 | 38 | 2.1%
11 | 35 | 2.0%
14 | 34 | 1.9%
7 | 32 | 1.8%
23 | 31 | 1.7%
16 | 28 | 1.6%
Other values (103) | 805 | 45.2%

Value | Count | Frequency (%)
0 | 655 | 36.8%
1 | 15 | 0.8%
2 | 14 | 0.8%
3 | 25 | 1.4%
4 | 42 | 2.4%
5 | 43 | 2.4%
6 | 38 | 2.1%
7 | 32 | 1.8%
8 | 38 | 2.1%
9 | 25 | 1.4%

Value | Count | Frequency (%)
1198 | 1 | 0.1%
709 | 1 | 0.1%
330 | 1 | 0.1%
294 | 1 | 0.1%
228 | 2 | 0.1%
210 | 1 | 0.1%
200 | 1 | 0.1%
194 | 1 | 0.1%
187 | 1 | 0.1%
162 | 1 | 0.1%

DNS_QUERY_TIMES
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct: 10
Distinct (%): 0.6%
Missing: 1
Missing (%): 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.2634831
Minimum: 0
Maximum: 20
Zeros: 976
Zeros (%): 54.8%
Negative: 0
Negative (%): 0.0%
Memory size: 14.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 4
95-th percentile: 8
Maximum: 20
Range: 20
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 2.9308526
Coefficient of variation (CV): 1.2948418
Kurtosis: 0.80566895
Mean: 2.2634831
Median Absolute Deviation (MAD): 0
Skewness: 1.1226261
Sum: 4029
Variance: 8.5898968
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=10)
Value | Count | Frequency (%)
0 | 976 | 54.8%
4 | 309 | 17.3%
6 | 213 | 12.0%
2 | 142 | 8.0%
8 | 105 | 5.9%
10 | 19 | 1.1%
12 | 12 | 0.7%
14 | 2 | 0.1%
20 | 1 | 0.1%
9 | 1 | 0.1%
(Missing) | 1 | 0.1%

Value | Count | Frequency (%)
0 | 976 | 54.8%
2 | 142 | 8.0%
4 | 309 | 17.3%
6 | 213 | 12.0%
8 | 105 | 5.9%
9 | 1 | 0.1%
10 | 19 | 1.1%
12 | 12 | 0.7%
14 | 2 | 0.1%
20 | 1 | 0.1%

Value | Count | Frequency (%)
20 | 1 | 0.1%
14 | 2 | 0.1%
12 | 12 | 0.7%
10 | 19 | 1.1%
9 | 1 | 0.1%
8 | 105 | 5.9%
6 | 213 | 12.0%
4 | 309 | 17.3%
2 | 142 | 8.0%
0 | 976 | 54.8%

Type
Categorical

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 14.0 KiB

0 | 1565
1 | 216

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1781
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 1565 | 87.9%
1 | 216 | 12.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
0 | 1565 | 87.9%
1 | 216 | 12.1%

Most occurring characters

Value | Count | Frequency (%)
0 | 1565 | 87.9%
1 | 216 | 12.1%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1781 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 1565 | 87.9%
1 | 216 | 12.1%

Most occurring scripts

Value | Count | Frequency (%)
Common | 1781 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 1565 | 87.9%
1 | 216 | 12.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1781 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 1565 | 87.9%
1 | 216 | 12.1%

Interactions

[Figure: 169 pairwise interaction plots of the numeric variables; images not reproduced]

Correlations


Auto

The auto setting is an interpretable pairwise column metric with the following mapping:
  • Variable_type-Variable_type : Method, Range
  • Categorical-Categorical : Cramer's V, [0,1]
  • Numerical-Categorical : Cramer's V, [0,1] (using a discretized numerical column)
  • Numerical-Numerical : Spearman's ρ, [-1,1]
The number of bins used in the discretization for the Numerical-Categorical column pair can be changed using config.correlations["auto"].n_bins. The number of bins affects the granularity of the association you wish to measure.
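To illustrate how the number of bins changes the granularity of a discretized numeric column, here is a minimal sketch using equal-width bins. The equal-width scheme and the helper name `equal_width_bins` are assumptions made for illustration; the report does not state which binning rule the library applies.

```python
def equal_width_bins(values, n_bins):
    """Map each number to a bin index in [0, n_bins - 1] using equal-width bins."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column falls entirely into one bin.
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # The maximum lands exactly on the upper edge, so clamp it into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


sample = [0, 2, 5, 10, 17]          # illustrative values, not taken from the dataset
print(equal_width_bins(sample, 4))  # finer grouping
print(equal_width_bins(sample, 2))  # coarser grouping
```

With fewer bins more values share a category, so the resulting Cramér's V reflects a coarser view of the numeric column.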

This configuration uses the recommended metric for each pair of columns.

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
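The calculation described above can be sketched in plain Python: rank both variables, then divide the covariance of the ranks by the product of their standard deviations. Giving tied values the average of their positions is an assumption of this sketch, and the function names are illustrative.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def covariance(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)


def spearman_rho(x, y):
    """Covariance of the rank variables divided by the product of their standard deviations."""
    rx, ry = average_ranks(x), average_ranks(y)
    return covariance(rx, ry) / math.sqrt(covariance(rx, rx) * covariance(ry, ry))


x = [1, 2, 3, 4, 5]
print(spearman_rho(x, [v ** 2 for v in x]))  # monotonic but nonlinear relation
```

Because only the ranks enter the calculation, any strictly increasing transformation of either variable leaves ρ unchanged.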

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
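As a minimal sketch of this formula, the function below divides the covariance of X and Y by the product of their standard deviations. It uses population (divide-by-n) moments, which is an assumption of the sketch; the choice cancels in the ratio. The name `pearson_r` is illustrative.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    sx = math.sqrt(sum((a - mx) ** 2 for a in x) / n)
    sy = math.sqrt(sum((b - my) ** 2 for b in y) / n)
    return cov / (sx * sy)


x = [1, 2, 3, 4, 5]
y = [v ** 2 for v in x]

print(pearson_r(x, y))                         # nonlinear relation, so r is below 1
print(pearson_r([10 * v - 4 for v in x], y))   # shifting and rescaling x leaves r unchanged
```

The second call illustrates the invariance under changes of location and (positive) scale mentioned above.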

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the discordant pairs divided by the total number of pairs.
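The pair-counting definition above can be sketched directly. This is the simple form that divides by the total number of pairs, exactly as described; statistical libraries often apply an additional correction for ties, which this sketch does not attempt. The name `kendall_tau` is illustrative.

```python
from itertools import combinations


def kendall_tau(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables order the pair the same way
        elif sign < 0:
            discordant += 1   # the variables order the pair in opposite ways
        # Pairs tied in x or y count as neither.
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


print(kendall_tau([1, 2, 3, 4], [1, 3, 2, 4]))
```

In the example, five of the six pairs are concordant and one is discordant.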

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
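A minimal sketch of a bias-corrected Cramér's V for two categorical sequences follows. It assumes the commonly cited form of Bergsma's correction, in which φ² and the table dimensions are each adjusted by terms in 1/(n - 1); whether the report's implementation matches this in every detail is not confirmed here. The function name is illustrative.

```python
import math
from collections import Counter


def cramers_v_corrected(x, y):
    """Bias-corrected Cramér's V from two equal-length sequences of categories."""
    n = len(x)
    row_totals = Counter(x)
    col_totals = Counter(y)
    cells = Counter(zip(x, y))

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for a, row_total in row_totals.items():
        for b, col_total in col_totals.items():
            expected = row_total * col_total / n
            observed = cells.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected

    r, k = len(row_totals), len(col_totals)
    phi2 = chi2 / n
    # Corrections assumed from Bergsma (2013).
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    if denom <= 0:
        return 0.0
    return math.sqrt(phi2_corr / denom)


perfect = cramers_v_corrected(["a"] * 4 + ["b"] * 4, ["u"] * 4 + ["v"] * 4)
independent = cramers_v_corrected(["a", "a", "b", "b"], ["u", "v", "u", "v"])
print(perfect, independent)
```

The two calls cover the extremes described above: identical groupings give a value of 1 (up to floating-point rounding), and a balanced, unrelated pairing gives 0.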

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available for it.

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
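One way to compute such a nullity correlation is to turn each column into a present/absent indicator and take the Pearson correlation of the indicators. The sketch below does this under that assumption; the rows and the column names "A", "B" and "C" are hypothetical and are not taken from the dataset.

```python
import math


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        # A column that is always present or always missing has no variation.
        return math.nan
    return cov / (sx * sy)


def nullity_correlation(rows, col_a, col_b):
    """Correlate presence indicators (1 = value present, 0 = missing) of two columns."""
    present_a = [0 if row[col_a] is None else 1 for row in rows]
    present_b = [0 if row[col_b] is None else 1 for row in rows]
    return pearson(present_a, present_b)


# Hypothetical rows: A and B are missing together, C is present only when A is missing.
rows = [
    {"A": 1, "B": 5, "C": None},
    {"A": None, "B": None, "C": 3},
    {"A": 2, "B": 7, "C": None},
    {"A": None, "B": None, "C": 4},
]
print(nullity_correlation(rows, "A", "B"))
print(nullity_correlation(rows, "A", "C"))
```

A value near +1 means two columns tend to be missing in the same rows, and a value near -1 means one tends to be present exactly when the other is missing.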

Sample

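index | URL | URL_LENGTH | NUMBER_SPECIAL_CHARACTERS | CHARSET | SERVER | CONTENT_LENGTH | WHOIS_COUNTRY | WHOIS_STATEPRO | WHOIS_REGDATE | WHOIS_UPDATED_DATE | TCP_CONVERSATION_EXCHANGE | DIST_REMOTE_TCP_PORT | REMOTE_IPS | APP_BYTES | SOURCE_APP_PACKETS | REMOTE_APP_PACKETS | SOURCE_APP_BYTES | REMOTE_APP_BYTES | APP_PACKETS | DNS_QUERY_TIMES | Type
0 | M0_109 | 16 | 7 | iso-8859-1 | nginx | 263.0 | None | None | 10/10/2015 18:21 | None | 7 | 0 | 2 | 700 | 9 | 10 | 1153 | 832 | 9 | 2.0 | 1
1 | B0_2314 | 16 | 6 | UTF-8 | Apache/2.4.10 | 15087.0 | None | None | None | None | 17 | 7 | 4 | 1230 | 17 | 19 | 1265 | 1230 | 17 | 0.0 | 0
2 | B0_911 | 16 | 6 | us-ascii | Microsoft-HTTPAPI/2.0 | 324.0 | None | None | None | None | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.0 | 0
3 | B0_113 | 17 | 6 | ISO-8859-1 | nginx | 162.0 | US | AK | 7/10/1997 4:00 | 12/09/2013 0:45 | 31 | 22 | 3 | 3812 | 39 | 37 | 18784 | 4380 | 39 | 8.0 | 0
4 | B0_403 | 17 | 6 | UTF-8 | None | 124140.0 | US | TX | 12/05/1996 0:00 | 11/04/2017 0:00 | 57 | 2 | 5 | 4278 | 61 | 62 | 129889 | 4586 | 61 | 4.0 | 0
5 | B0_2064 | 18 | 7 | UTF-8 | nginx | NaN | SC | Mahe | 3/08/2016 14:30 | 3/10/2016 3:45 | 11 | 6 | 9 | 894 | 11 | 13 | 838 | 894 | 11 | 0.0 | 0
6 | B0_462 | 18 | 6 | iso-8859-1 | Apache/2 | 345.0 | US | CO | 29/07/2002 0:00 | 1/07/2016 0:00 | 12 | 0 | 3 | 1189 | 14 | 13 | 8559 | 1327 | 14 | 2.0 | 0
7 | B0_1128 | 19 | 6 | us-ascii | Microsoft-HTTPAPI/2.0 | 324.0 | US | FL | 18/03/1997 0:00 | 19/03/2017 0:00 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.0 | 0
8 | M2_17 | 20 | 5 | utf-8 | nginx/1.10.1 | NaN | None | None | 8/11/2014 7:41 | None | 0 | 0 | 0 | 0 | 2 | 3 | 213 | 146 | 2 | 2.0 | 1
9 | M3_75 | 20 | 5 | utf-8 | nginx/1.10.1 | NaN | None | None | 8/11/2014 7:41 | None | 0 | 0 | 0 | 0 | 2 | 1 | 62 | 146 | 2 | 2.0 | 1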
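
index | URL | URL_LENGTH | NUMBER_SPECIAL_CHARACTERS | CHARSET | SERVER | CONTENT_LENGTH | WHOIS_COUNTRY | WHOIS_STATEPRO | WHOIS_REGDATE | WHOIS_UPDATED_DATE | TCP_CONVERSATION_EXCHANGE | DIST_REMOTE_TCP_PORT | REMOTE_IPS | APP_BYTES | SOURCE_APP_PACKETS | REMOTE_APP_PACKETS | SOURCE_APP_BYTES | REMOTE_APP_BYTES | APP_PACKETS | DNS_QUERY_TIMES | Type
1771 | M4_43 | 170 | 17 | UTF-8 | Apache | NaN | ES | Barcelona | 17/09/2008 0:00 | 2/09/2016 0:00 | 0 | 0 | 0 | 0 | 0 | 2 | 124 | 0 | 0 | 0.0 | 1
1772 | M4_61 | 173 | 34 | UTF-8 | Apache | NaN | ES | Barcelona | 17/09/2008 0:00 | 2/09/2016 0:00 | 1 | 1 | 1 | 90 | 1 | 5 | 416 | 90 | 1 | 0.0 | 1
1773 | M4_39 | 178 | 16 | UTF-8 | Apache | NaN | ES | Barcelona | 17/09/2008 0:00 | 2/09/2016 0:00 | 0 | 0 | 0 | 0 | 0 | 3 | 186 | 0 | 0 | 0.0 | 1
1774 | B0_156 | 183 | 29 | ISO-8859-1 | Microsoft-IIS/7.5; litigation_essentials.lexisnexis.com 9999 | 4890.0 | US | NY | 26/06/1997 0:00 | 18/11/2014 0:00 | 22 | 2 | 7 | 2062 | 30 | 26 | 8161 | 2742 | 30 | 8.0 | 0
1775 | M4_45 | 194 | 17 | UTF-8 | Apache | NaN | ES | Barcelona | 17/09/2008 0:00 | 2/09/2016 0:00 | 0 | 0 | 0 | 0 | 0 | 3 | 186 | 0 | 0 | 0.0 | 1
1776 | M4_48 | 194 | 16 | UTF-8 | Apache | NaN | ES | Barcelona | 17/09/2008 0:00 | 2/09/2016 0:00 | 0 | 0 | 0 | 0 | 0 | 3 | 186 | 0 | 0 | 0.0 | 1
1777 | M4_41 | 198 | 17 | UTF-8 | Apache | NaN | ES | Barcelona | 17/09/2008 0:00 | 2/09/2016 0:00 | 0 | 0 | 0 | 0 | 0 | 2 | 124 | 0 | 0 | 0.0 | 1
1778 | B0_162 | 201 | 34 | utf-8 | Apache/2.2.16 (Debian) | 8904.0 | US | FL | 15/02/1999 0:00 | 15/07/2015 0:00 | 83 | 2 | 6 | 6631 | 87 | 89 | 132181 | 6945 | 87 | 4.0 | 0
1779 | B0_1152 | 234 | 34 | ISO-8859-1 | cloudflare-nginx | NaN | US | CA | 1/04/1998 0:00 | 9/12/2016 0:00 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.0 | 0
1780 | B0_676 | 249 | 40 | utf-8 | Microsoft-IIS/8.5 | 24435.0 | US | Wisconsin | 14/11/2008 0:00 | 20/11/2013 0:00 | 19 | 6 | 11 | 2314 | 25 | 28 | 3039 | 2776 | 25 | 6.0 | 0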